Lianyungang
The Road Less Traveled: Enhancing Exploration in LLMs via Sequential Sampling
Reinforcement learning (RL) has been pivotal in enhancing the reasoning capabilities of large language models (LLMs), but it often suffers from limited exploration and entropy collapse, where models exploit a narrow set of solutions, leading to a loss of sampling diversity and subsequently preventing RL from further improving performance. This issue is exacerbated in parallel sampling methods, where multiple outputs are drawn from the same distribution, potentially causing the model to converge to similar solutions. We propose SESA, a novel SEquential SAmpling framework that mitigates this challenge by generating diverse solution sketches sequentially before expanding them into full reasoning paths. This approach ensures broader exploration by conditioning each new output on previous ones, promoting diversity throughout the process and preventing policy collapse. Our experiments on a synthetic task show that sequential sampling consistently outperforms traditional RL methods in terms of path diversity and recovery from collapse. Further evaluations on real-world tasks demonstrate that SESA improves both the exploration of valid strategies and the overall performance of LLMs. On three agent benchmarks, SESA lifts success rates by $+0.25$, $+0.42$, and $+0.07$ absolute over the base model (up to an additional $211\%$ relative improvement over baseline RL), underscoring its exploration advantage. This work introduces a structured approach to exploration, paving the way for more effective and diverse reasoning in RL-trained LLMs. Our code is released at https://github.com/MuLabPKU/sesa.
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Heilongjiang Province > Harbin (0.04)
- Asia > China > Tianjin Province > Tianjin (0.04)
- (13 more...)
A Disease-Centric Vision-Language Foundation Model for Precision Oncology in Kidney Cancer
Tao, Yuhui, Zhao, Zhongwei, Wang, Zilong, Luo, Xufang, Chen, Feng, Wang, Kang, Wu, Chuanfu, Zhang, Xue, Zhang, Shaoting, Yao, Jiaxi, Jin, Xingwei, Jiang, Xinyang, Yang, Yifan, Li, Dongsheng, Qiu, Lili, Shao, Zhiqiang, Guo, Jianming, Yu, Nengwang, Wang, Shuo, Xiong, Ying
The non-invasive assessment of increasingly incidentally discovered renal masses is a critical challenge in urologic oncology, where diagnostic uncertainty frequently leads to the overtreatment of benign or indolent tumors. In this study, we developed and validated RenalCLIP using a dataset of 27,866 CT scans from 8,809 patients across nine Chinese medical centers and the public TCIA cohort, a visual-language foundation model for characterization, diagnosis and prognosis of renal mass. The model was developed via a two-stage pre-training strategy that first enhances the image and text encoders with domain-specific knowledge before aligning them through a contrastive learning objective, to create robust representations for superior generalization and diagnostic precision. RenalCLIP achieved better performance and superior generalizability across 10 core tasks spanning the full clinical workflow of kidney cancer, including anatomical assessment, diagnostic classification, and survival prediction, compared with other state-of-the-art general-purpose CT foundation models. Especially, for complicated task like recurrence-free survival prediction in the TCIA cohort, RenalCLIP achieved a C-index of 0.726, representing a substantial improvement of approximately 20% over the leading baselines. Furthermore, RenalCLIP's pre-training imparted remarkable data efficiency; in the diagnostic classification task, it only needs 20% training data to achieve the peak performance of all baseline models even after they were fully fine-tuned on 100% of the data. Additionally, it achieved superior performance in report generation, image-text retrieval and zero-shot diagnosis tasks. Our findings establish that RenalCLIP provides a robust tool with the potential to enhance diagnostic accuracy, refine prognostic stratification, and personalize the management of patients with kidney cancer.
- Asia > China > Fujian Province > Xiamen (0.05)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Jiangsu Province > Lianyungang (0.04)
- (3 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Kidney Cancer (1.00)
- Health & Medicine > Therapeutic Area > Nephrology (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Decomposability-Guaranteed Cooperative Coevolution for Large-Scale Itinerary Planning
Zhang, Ziyu, Xu, Peilan, Sun, Yuetong, Shi, Yuhui, Luo, Wenjian
--Large-scale itinerary planning is a variant of the traveling salesman problem, aiming to determine an optimal path that maximizes the collected points of interest (POIs) scores while minimizing travel time and cost, subject to travel duration constraints. This paper analyzes the decomposability of large-scale itinerary planning, proving that strict decomposability is difficult to satisfy, and introduces a weak decomposability definition based on a necessary condition, deriving the corresponding graph structures that fulfill this property. With decomposability guaranteed, we propose a novel multi-objective cooperative coevolutionary algorithm for large-scale itinerary planning, addressing the challenges of component imbalance and interactions. Specifically, we design a dynamic decomposition strategy based on the normalized fitness within each component, define optimization potential considering component scale and contribution, and develop a computational resource allocation strategy. Finally, we evaluate the proposed algorithm on a set of real-world datasets. Comparative experiments with state-of-the-art multi-objective itinerary planning algorithms demonstrate the superiority of our approach, with performance advantages increasing as the problem scale grows. Itinerary planning is a class of the orienteering problem, where a traveler aims to determine an optimal route within a city under given duration constraints, selecting a subset of points of interest (POIs) to maximize the total collected score [1]. It can be seen as a variant of the traveling salesman problem (TSP) and a combination of the knapsack problem and TSP [2]. As a real-world application, itinerary planning not only seeks to maximize the overall travel experience, i.e., the total collected score, but also considers objectives such as minimizing travel time and cost. This work is partly supported by the Natural Science Foundation of Jiangsu Province (Grant No. BK20230419), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 23KJB520018) and the National Natural Science Foundation of China (Grant No. U23B2058). Wenjian Luo is with the School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, Guangdong, China.
- Asia > China > Guangdong Province > Shenzhen (0.44)
- Asia > China > Heilongjiang Province > Harbin (0.24)
- Asia > China > Jiangsu Province > Nanjing (0.06)
- (7 more...)
- Overview (1.00)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
LLMQuoter: Enhancing RAG Capabilities Through Efficient Quote Extraction From Large Contexts
Bezerra, Yuri Facanha, Weigang, Li
We introduce LLMQuoter, a lightweight, distillation-based model designed to enhance Retrieval Augmented Generation (RAG) by extracting the most relevant textual evidence for downstream reasoning tasks. Built on the LLaMA-3B architecture and fine-tuned with Low-Rank Adaptation (LoRA) on a 15,000-sample subset of HotpotQA, LLMQuoter adopts a "quote-first-then-answer" strategy, efficiently identifying key quotes before passing curated snippets to reasoning models. This workflow reduces cognitive overhead and outperforms full-context approaches like Retrieval-Augmented Fine-Tuning (RAFT), achieving over 20-point accuracy gains across both small and large language models. By leveraging knowledge distillation from a high-performing teacher model, LLMQuoter achieves competitive results in a resource-efficient fine-tuning setup. It democratizes advanced RAG capabilities, delivering significant performance improvements without requiring extensive model retraining. Our results highlight the potential of distilled quote-based reasoning to streamline complex workflows, offering a scalable and practical solution for researchers and practitioners alike.
- South America > Brazil > Federal District > Brasília (0.04)
- Asia > China > Jiangsu Province > Xuzhou (0.04)
- North America > United States > New Hampshire (0.04)
- (7 more...)
- Transportation > Ground > Rail (1.00)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Transportation > Passenger (0.69)
STARFormer: A Novel Spatio-Temporal Aggregation Reorganization Transformer of FMRI for Brain Disorder Diagnosis
Dong, Wenhao, Li, Yueyang, Zeng, Weiming, Chen, Lei, Yan, Hongjie, Siok, Wai Ting, Wang, Nizhuan
Many existing methods that use functional magnetic resonance imaging (fMRI) classify brain disorders, such as autism spectrum disorder (ASD) and attention deficit hyperactivity disorder (ADHD), often overlook the integration of spatial and temporal dependencies of the blood oxygen level-dependent (BOLD) signals, which may lead to inaccurate or imprecise classification results. To solve this problem, we propose a Spatio-Temporal Aggregation eorganization ransformer (STARFormer) that effectively captures both spatial and temporal features of BOLD signals by incorporating three key modules. The region of interest (ROI) spatial structure analysis module uses eigenvector centrality (EC) to reorganize brain regions based on effective connectivity, highlighting critical spatial relationships relevant to the brain disorder. The temporal feature reorganization module systematically segments the time series into equal-dimensional window tokens and captures multiscale features through variable window and cross-window attention. The spatio-temporal feature fusion module employs a parallel transformer architecture with dedicated temporal and spatial branches to extract integrated features. The proposed STARFormer has been rigorously evaluated on two publicly available datasets for the classification of ASD and ADHD. The experimental results confirm that the STARFormer achieves state-of-the-art performance across multiple evaluation metrics, providing a more accurate and reliable tool for the diagnosis of brain disorders and biomedical research. The codes will be available at: https://github.com/NZWANG/STARFormer.
Aided design of bridge aesthetics based on Stable Diffusion fine-tuning
Zhang, Leye, Tian, Xiangxiang, Zhang, Chengli, Zhang, Hongjun
Stable Diffusion fine-tuning technique is tried to assist bridge-type innovation. The bridge real photo dataset is built, and Stable Diffusion is fine tuned by using four methods that are Textual Inversion, Dreambooth, Hypernetwork and Lora. All of them can capture the main characteristics of dataset images and realize the personalized customization of Stable Diffusion. Through fine-tuning, Stable Diffusion is not only a drawing tool, but also has the designer's innovative thinking ability. The fine tuned model can generate a large number of innovative new bridge types, which can provide rich inspiration for human designers. The result shows that this technology can be used as an engine of creativity and a power multiplier for human designers.
- Asia > China > Jiangsu Province > Lianyungang (0.05)
- Asia > China > Shandong Province > Qingdao (0.05)
- North America > Mexico > Gulf of Mexico (0.04)
- (2 more...)
Question answering system of bridge design specification based on large language model
Zhang, Leye, Tian, Xiangxiang, Zhang, Hongjun
This paper constructs question answering system for bridge design specification based on large language model. Three implementation schemes are tried: full fine-tuning of the Bert pretrained model, parameter-efficient fine-tuning of the Bert pretrained model, and self-built language model from scratch. Through the self-built question and answer task dataset, based on the tensorflow and keras deep learning platform framework, the model is constructed and trained to predict the start position and end position of the answer in the bridge design specification given by the user. The experimental results show that full fine-tuning of the Bert pretrained model achieves 100% accuracy in the training-dataset, validation-dataset and test-dataset, and the system can extract the answers from the bridge design specification given by the user to answer various questions of the user; While parameter-efficient fine-tuning of the Bert pretrained model and self-built language model from scratch perform well in the training-dataset, their generalization ability in the test-dataset needs to be improved. The research of this paper provides a useful reference for the development of question answering system in professional field.
Economic span selection of bridge based on deep reinforcement learning
Zhang, Leye, Tian, Xiangxiang, Zhang, Chengli, Zhang, Hongjun
Deep Q-network algorithm is used to select economic span of bridge. Selection of bridge span has a significant impact on the total cost of bridge, and a reasonable selection of span can reduce engineering cost. Economic span of bridge is theoretically analyzed, and the theoretical solution formula of economic span is deduced. Construction process of bridge simulation environment is described in detail, including observation space, action space and reward function of the environment. Agent is constructed, convolutional neural network is used to approximate Q function,{\epsilon} greedy policy is used for action selection, and experience replay is used for training. The test verifies that the agent can successfully learn optimal policy and realize economic span selection of bridge. This study provides a potential decision-making tool for bridge design.
- Transportation > Ground (0.36)
- Materials > Construction Materials (0.36)
- Leisure & Entertainment > Games (0.36)
Kinematic analysis of structural mechanics based on convolutional neural network
Zhang, Leye, Tian, Xiangxiang, Zhang, Hongjun
Attempt to use convolutional neural network to achieve kinematic analysis of plane bar structure. Through 3dsMax animation software and OpenCV module, self-build image dataset of geometrically stable system and geometrically unstable system. we construct and train convolutional neural network model based on the TensorFlow and Keras deep learning platform framework. The model achieves 100% accuracy on the training set, validation set, and test set. The accuracy on the additional test set is 93.7%, indicating that convolutional neural network can learn and master the relevant knowledge of kinematic analysis of structural mechanics. In the future, the generalization ability of the model can be improved through the diversity of dataset, which has the potential to surpass human experts for complex structures. Convolutional neural network has certain practical value in the field of kinematic analysis of structural mechanics. Using visualization technology, we reveal how convolutional neural network learns and recognizes structural features. Using pre-trained VGG16 model for feature extraction and fine-tuning, we found that the generalization ability is inferior to the self-built model.
A Mobile Data-Driven Hierarchical Deep Reinforcement Learning Approach for Real-time Demand-Responsive Railway Rescheduling and Station Overcrowding Mitigation
Liu, Enze, Lin, Zhiyuan, Wang, Judith Y. T., Chen, Hong
Real-time railway rescheduling is an important technique to enable operational recovery in response to unexpected and dynamic conditions in a timely and flexible manner. Current research relies mostly on OD based data and model-based methods for estimating train passenger demands. These approaches primarily focus on averaged disruption patterns, often overlooking the immediate uneven distribution of demand over time. In reality, passenger demand deviates significantly from predictions, especially during a disaster. Disastrous situations such as flood in Zhengzhou, China in 2022 has created not only unprecedented effect on Zhengzhou railway station itself, which is a major railway hub in China, but also other major hubs connected to Zhengzhou, e.g., Xi'an, the closest hub west of Zhengzhou. In this study, we define a real-time demand-responsive (RTDR) railway rescheduling problem focusing two specific aspects, namely, volatility of the demand, and management of station crowdedness. For the first time, we propose a data-driven approach using real-time mobile data (MD) to deal with this RTDR problem. A hierarchical deep reinforcement learning (HDRL) framework is designed to perform real-time rescheduling in a demand-responsive manner. The use of MD has enabled the modelling of passenger dynamics in response to train delays and station crowdedness, and a real-time optimisation for rescheduling of train services in view of the change in demand as a result of passengers' behavioural response to disruption. Results show that the agent can steadily satisfy over 62% of the demand with only 61% of the original rolling stock, ensuring continuous operations without overcrowding. Moreover, the agent exhibits adaptability when transferred to a new environment with increased demand, highlighting its effectiveness in addressing unforeseen disruptions in real-time settings.
- Asia > China > Henan Province > Zhengzhou (0.85)
- Asia > China > Shaanxi Province > Xi'an (0.26)
- Asia > China > Shanghai > Shanghai (0.04)
- (11 more...)